
MDC Repository

Mining the diseasome

Author: AK Daly
AL Barabási
Davnah Urbach
FS Collins
G Jimenez-Sanchez
Jason H Moore
KI Goh
LA Hindorff
ML Freedman
R Cowper-Sal·lari
SSSJ Ahmed
Publication venue: BioMed Central
Publication date: 01/01/2011

Lund University Publications

Genomic Position Mapping Discrepancies of Commercial SNP Chips

Author: AO Schmitt
AV Zimin
CA Anderson
CG Elsik
Christian Bendixen
D Fredman
F Pereyra
João Fadista
K Pelak
LA Hindorff
Michael Edward Zwick
MR Ho
R Luethy
S Macgregor
SF Altschul
Z Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2011

The field of genetics has come to rely heavily on commercial genotyping arrays and their accompanying annotations for insights into genotype-phenotype associations. To avoid errors and false leads, however, the annotation of SNP chromosomal positions must be accurate and unambiguous. We report on genomic positional discrepancies in various SNP chips for human, cattle and mouse, and discuss their causes and consequences.

CiteSeerX

FigShare

ParallABEL: an R library for generalized parallelization of genome-wide association studies

Author: F Dudbridge
G Vera
H Mishima
J Hill
K Misawa
L Ma
LA Hindorff
NM Laird
Pichaya Tandayya
R Ihaka
RM Plenge
Surakameth Mahasirimongkol
TA Pearson
Unitsa Sangket
Wasun Chantratita
YS Aulchenko
Yurii S Aulchenko
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. It is arduous to acquire the programming skills needed to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files. Results: Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes genotypes of 545,080 SNPs for 2,062 individuals, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions: Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
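
The partition, distribute and merge pattern described for the first group (one statistic per SNP) can be sketched as follows. This is a minimal Python illustration, not ParallABEL's R/Rmpi code: the function names `split_indices`, `allele_frequency` and `per_snp_parallel` are hypothetical, a thread pool stands in for the MPI worker processes, and the allele frequency is only an example of a per-SNP statistic.

```python
from concurrent.futures import ThreadPoolExecutor


def split_indices(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def allele_frequency(genotypes_for_snp):
    """Frequency of the counted allele for one SNP.

    Genotypes are coded 0/1/2 (copies of the counted allele); None is missing.
    """
    called = [g for g in genotypes_for_snp if g is not None]
    return sum(called) / (2 * len(called)) if called else float("nan")


def per_snp_parallel(genotypes, n_workers):
    """Compute one statistic per SNP, chunk by chunk, then merge in SNP order.

    genotypes[s] holds the genotype of every individual at SNP s.
    Threads are an assumption made for this sketch; a cluster run would
    send each chunk to a separate process or node instead.
    """
    chunks = split_indices(len(genotypes), n_workers)

    def work(chunk):
        return [allele_frequency(genotypes[s]) for s in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(work, chunks))  # map preserves chunk order
    return [value for part in parts for value in part]
```

Because each SNP is processed independently, the chunks need no communication with each other, which is consistent with the near-linear speed-up the abstract reports for this kind of computation.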

Erasmus University Digital Repository

Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data

Author: AP Morris
B Li
BE Madsen
C Dering
DE Reich
DJ Smith
F Han
Hongyu Zhao
J Graham
JK Pritchard
JN Hirschhorn
John Ferguson
Joon Sang Lee
LA Almasy
LA Hindorff
Lun Li
NJ Schork
P Donnelly
R Tibshirani
SK Iyengar
Wei Zheng
Xiting Yan
Publication venue: BioMed Central
Publication date: 01/01/2011

Association studies using tag SNPs have been successful in detecting disease-associated common variants. However, common variants, with rare exceptions, explain at most 5–10% of the heritability resulting from genetic factors, which leads to the common disease/rare variants assumption. Indeed, recent studies using sequencing technologies have demonstrated that common diseases can be due to rare variants that could not be systematically studied earlier. Unfortunately, methods for common variants are not optimal if applied to rare variants. To identify rare variants that affect disease risk, several investigators have designed new approaches based on the idea of collapsing different rare variants inside the same genomic block (e.g., the same gene or pathway) to enrich the signal. Here, we consider three different collapsing methods in the multimarker regression model and compare their performance on the Genetic Analysis Workshop 17 data using the consistency of results across different simulations and the cross-validation prediction error rate. The comparison shows that the proportion collapsing method seems to outperform the other two methods and can find both truly associated rare and common variants. Moreover, we explore one way of incorporating the functional annotations for the variants in the data: collapsing nonsynonymous and synonymous variants separately to allow for different penalties on them. The incorporation of functional annotations led to higher sensitivity and specificity levels when the detection results were compared with the answer sheet. The initial analysis was performed without knowledge of the simulating model.
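
The idea of collapsing rare variants within a genomic block can be illustrated with a short sketch. The abstract does not define its three collapsing methods, so the two codings below (an indicator of carrying any rare variant, and the proportion of rare sites carried) are assumptions chosen for illustration; the function name `collapse_rare_variants` and the `maf_threshold` default are hypothetical.

```python
def collapse_rare_variants(genotypes, mafs, maf_threshold=0.01):
    """Collapse the rare variants of one gene into per-individual scores.

    genotypes[i][j] is the minor-allele count (0/1/2) of individual i
    at variant j; mafs[j] is the minor allele frequency of variant j.
    Returns (indicator, proportion), one value per individual:
      indicator  -- 1 if the individual carries any rare variant, else 0
      proportion -- fraction of rare sites at which a minor allele is carried
    Both codings are assumptions for this sketch, not the paper's definitions.
    """
    rare = [j for j, freq in enumerate(mafs) if freq < maf_threshold]
    indicator, proportion = [], []
    for row in genotypes:
        carried = sum(1 for j in rare if row[j] > 0)
        indicator.append(1 if carried > 0 else 0)
        proportion.append(carried / len(rare) if rare else 0.0)
    return indicator, proportion
```

Either score can then enter a regression model as a single predictor for the gene in place of many sparse per-variant columns. Running the function separately on nonsynonymous and synonymous variants would give the two predictors per gene that the abstract describes penalizing differently.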

Trait-Associated SNPs Are More Likely to Be eQTLs: Annotation to Enhance Discovery from GWAS

Author: A Gerrits
AL Stark
AS Dimas
D Levy
Dan L. Nicolae
DB Goldstein
DJ Klionsky
DL Nicolae
E Choy
EE Schadt
ER Gamazon
Eric Gamazon
GR Abecasis
Greg Gibson
HR Coleman
I Hovatta
J Hampe
JC Barrett
JN Hirschhorn
K Bullaughey
L Shi
LA Hindorff
M Comabella
M. Eileen Dolan
MJ Cowley
Nancy J. Cox
P Kraft
RA Irizarry
S Duan
S Purcell
Shiwei Duan
TA Manolio
V Emilsson
Wei Zhang
Publication venue: Public Library of Science
Publication date: 01/04/2010

Although genome-wide association studies (GWAS) of complex traits have yielded more reproducible associations than had been discovered using any other approach, the loci characterized to date do not account for much of the heritability of such traits and, in general, have not led to improved understanding of the biology underlying complex phenotypes. Using a web site we developed to serve results of expression quantitative trait locus (eQTL) studies in lymphoblastoid cell lines from HapMap samples (http://www.scandb.org), we show that single nucleotide polymorphisms (SNPs) associated with complex traits (from http://www.genome.gov/gwastudies/) are significantly more likely to be eQTLs than minor-allele-frequency–matched SNPs chosen from high-throughput GWAS platforms. These findings are robust across a range of thresholds for establishing eQTLs (p-values from 10^-4 to 10^-8) and across a broad spectrum of human complex traits. Analyses of GWAS data from the Wellcome Trust studies confirm that annotating SNPs with a score reflecting the strength of the evidence that the SNP is an eQTL can improve the ability to discover true associations and clarify the nature of the mechanism driving the associations. Our results, which show that trait-associated SNPs are more likely to be eQTLs and that applying this information can enhance discovery of trait-associated SNPs for complex phenotypes, raise the possibility that we can use this information both to increase the heritability explained by identifiable genetic factors and to gain a better understanding of the biology underlying complex traits.
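
The comparison against minor-allele-frequency-matched SNPs can be sketched as a simple resampling test. This is a minimal illustration under stated assumptions, not the authors' procedure: the bin count, the number of draws, the empirical p-value formula, and the names `maf_bin` and `enrichment_p_value` are all hypothetical choices.

```python
import random


def maf_bin(maf, n_bins=10):
    """Map a minor allele frequency in [0, 0.5] to one of n_bins equal-width bins."""
    return min(int(maf * 2 * n_bins), n_bins - 1)


def enrichment_p_value(trait_snps, background_snps, n_draws=1000, n_bins=10, seed=0):
    """Test whether trait-associated SNPs are eQTLs more often than matched SNPs.

    Each SNP is a (maf, is_eqtl) pair. For every draw, each trait SNP is
    replaced by a random background SNP from the same MAF bin, and the
    number of eQTLs in the matched set is recorded. Every MAF bin used by a
    trait SNP must contain at least one background SNP.
    Returns (observed_eqtl_count, empirical_p_value).
    """
    rng = random.Random(seed)
    by_bin = {}
    for maf, is_eqtl in background_snps:
        by_bin.setdefault(maf_bin(maf, n_bins), []).append(is_eqtl)

    observed = sum(1 for _, is_eqtl in trait_snps if is_eqtl)
    at_least_as_extreme = 0
    for _ in range(n_draws):
        null_count = sum(
            rng.choice(by_bin[maf_bin(maf, n_bins)]) for maf, _ in trait_snps
        )
        if null_count >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_draws + 1)
```

Matching on allele frequency matters because the power to detect an eQTL depends on it, so an unmatched background could make trait-associated SNPs look enriched for reasons unrelated to biology.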

CiteSeerX

Meta-eQTL: a tool set for flexible eQTL meta-analysis

Author: AA Shabalin
Antonio Fabio Di Narzo
AP Boyle
B Howie
CJ Willer
DM Greenawalt
EE Schadt
Haoxiang Cheng
Jianwei Lu
K Hao
Ke Hao
L Liang
LA Hindorff
S Sanna
V Emilsson
Publication venue: Springer Science and Business Media LLC

arXiv.org e-Print Archive

Rare coding SNP in DZIP1 gene associated with late-onset sporadic Parkinson's disease

Author: A Lerner
A Ruepp
AA Merchant
AM Glazer
B Bakir-Gungor
B Dass
C Wolff
CB Do
CH Hawkes
D Subramaniam
DAD Monte
DI Chasman
E Eskin
FL Moore
H-C Fung
HR Kim
J Simón-Sánchez
K Lai
K Roeder
K Sekimizu
K Tsuboi
L Lum
LA Hindorff
LM Bekris
M Bak
M Plaisant
M Saad
M-X Li
N Miao
O Bragina
P Mill
P Whitton
PA Beachy
PW Ingham
RE Lamont
S Purcell
SM Chambers
SY Tay
TA Manolio
TH Hamza
TL Edwards
TS Keshava Prasad
V Palma
VF Rafuse
W Satake
Y Katoh
Publication date: 11/09/2011

We present the first application of the hypothesis-rich mathematical theory to genome-wide association data. The Hamza et al. late-onset sporadic Parkinson's disease genome-wide association study dataset was analyzed. We found a rare, coding, non-synonymous SNP variant in the gene DZIP1 that confers increased susceptibility to Parkinson's disease. The association of DZIP1 with Parkinson's disease is consistent with a Parkinson's disease stem-cell ageing theory. Comment: 14 pages

Estudo Geral

Type 2 diabetes genetic association database manually curated for the study design and odds ratio

Author: AD Johnson
Bermseok Oh
Hun Kuk Park
Hyun-Seok Jin
JE Shaw
Ji Eun Lim
Kyung-Won Hong
LA Hindorff
MD Mailman
RK Campbell
S Agrawal
S Kim
S Wild
W Yu
Yang Seok Kim
Publication venue: BioMed Central
Publication date: 01/12/2010

Background: The prevalence of type 2 diabetes has reached epidemic proportions worldwide, and the incidence of life-threatening complications of diabetes through continued exposure of tissues to high glucose levels is increasing. Advances in genotyping technology have increased the scale and accuracy of genotype data, so the number of genetic association studies has expanded enormously. Consequently, it is difficult to search the published association data efficiently. Several databases of association results have been constructed, but they have limitations for researchers: some provide only genome-wide association data, some focus on integrative data rather than on association, and some are not user-friendly. In this study, a user-friendly, manually curated database of type 2 diabetes genetic associations was constructed. Description: The list of publications used in this study was collected from the HuGE Navigator, an online database of published genome epidemiology literature. Because the type 2 diabetes genetic association database (T2DGADB) aims to provide specialized information on the genetic risk factors involved in the development of type 2 diabetes, 701 of the 1,771 type 2 diabetes publications, namely the case-control studies of the development of the disease, were extracted. Conclusions: In the database, the association results were grouped as either positive or negative. The gene and SNP names were replaced with gene symbols and rsSNP numbers, the association p-values were determined manually, and the results are displayed in graphs and tables. In addition, the study design in each publication, such as the population type and size, is described. This database can be used for research purposes, such as association and functional studies of type 2 diabetes related genes, and as a primary genetic resource for constructing a diabetes risk test in preparation for personalized medicine.